Republic of Ingushetia
The first open machine translation system for the Chechen language
Umishov, Abu-Viskhan A., Grigorian, Vladislav A.
We introduce the first open-source model for translation between the vulnerable Chechen language and Russian, and the dataset collected to train and evaluate it. We explore fine-tuning capabilities for including a new language into a large language model system for multilingual translation NLLB-200. The BLEU / ChrF++ scores for our model are 8.34 / 34.69 and 20.89 / 44.55 for translation from Russian to Chechen and reverse direction, respectively. The release of the translation models is accompanied by the distribution of parallel words, phrases and sentences corpora and multilingual sentence encoder adapted to the Chechen language.
- Asia > Russia (0.14)
- Europe > Russia > North Caucasian Federal District > Chechen Republic (0.14)
- Europe > Finland > Uusimaa > Helsinki (0.07)
- (5 more...)
The Knowledge Microscope: Features as Better Analytical Lenses than Neurons
Chen, Yuheng, Cao, Pengfei, Liu, Kang, Zhao, Jun
Previous studies primarily utilize MLP neurons as units of analysis for understanding the mechanisms of factual knowledge in Language Models (LMs); however, neurons suffer from polysemanticity, leading to limited knowledge expression and poor interpretability. In this paper, we first conduct preliminary experiments to validate that Sparse Autoencoders (SAE) can effectively decompose neurons into features, which serve as alternative analytical units. With this established, our core findings reveal three key advantages of features over neurons: (1) Features exhibit stronger influence on knowledge expression and superior interpretability. (2) Features demonstrate enhanced monosemanticity, showing distinct activation patterns between related and unrelated facts. (3) Features achieve better privacy protection than neurons, demonstrated through our proposed FeatureEdit method, which significantly outperforms existing neuron-based approaches in erasing privacy-sensitive information from LMs.Code and dataset will be available.
- Oceania > Australia (0.04)
- North America > Dominican Republic (0.04)
- Asia > Singapore (0.04)
- (19 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.95)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Knowledge Localization: Mission Not Accomplished? Enter Query Localization!
Chen, Yuheng, Cao, Pengfei, Chen, Yubo, Liu, Kang, Zhao, Jun
Large language models (LLMs) store extensive factual knowledge, but the mechanisms behind how they store and express this knowledge remain unclear. The Knowledge Neuron (KN) thesis is a prominent theory for explaining these mechanisms. This theory is based on the knowledge localization (KL) assumption, which suggests that a fact can be localized to a few knowledge storage units, namely knowledge neurons. However, this assumption may be overly strong regarding knowledge storage and neglects knowledge expression mechanisms. Thus, we re-examine the KL assumption and confirm the existence of facts that do not adhere to it from both statistical and knowledge modification perspectives. Furthermore, we propose the Query Localization (QL) assumption. (1) Query-KN Mapping: The localization results are associated with the query rather than the fact. (2) Dynamic KN Selection: The attention module contributes to the selection of KNs for answering a query. Based on this, we further propose the Consistency-Aware KN modification method, which improves the performance of knowledge modification. We conduct 39 sets of experiments, along with additional visualization experiments, to rigorously validate our conclusions.
- North America > Dominican Republic (0.04)
- Asia > Singapore (0.04)
- Asia > China > Beijing > Beijing (0.04)
- (21 more...)